In this module, we will cover the most fundamental concepts associated with color images. These include color spaces, color channels, and some practical considerations associated with reading and displaying color images.
import cv2
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
%matplotlib inline
from IPython.display import Image
plt.rcParams['image.cmap'] = 'gray'
if 'google.colab' in str(get_ipython()):
    print("Downloading Code to Colab Environment")
    !wget https://www.dropbox.com/sh/8mu8erfnvqk3dzu/AABWmDJkjv-TvECMPRKlufNYa?dl=1 -O module-code.zip -q --show-progress
    !unzip -qq module-code.zip
else:
    pass
Until now, we have been using grayscale images in our discussion. Let's now start working with color images. We will begin by reading and displaying color images in two different formats. We will also discuss some of the unexpected results that can occur.
We will start with a simple color image: the Facebook logo in JPG format.

Color images are typically represented using three separate "color" channels. The specific color channels used to represent a color image depend on the Color Space. One of the most common color spaces is the RGB color space, which contains Red, Green, and Blue channels.
# Read the image.
logo = 'facebook_logo.jpg'
logo_img = cv2.imread(logo, cv2.IMREAD_COLOR)
# Print the size of the image.
print("Image size is ", logo_img.shape)
Image size is (301, 800, 3)
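The shape printed above is ordered as (rows, columns, channels), that is, (height, width, channels). As a minimal sketch, the snippet below uses a synthetic NumPy array in place of the logo (the array and variable names are illustrative assumptions, not part of the original notebook) to show how the shape unpacks and what a single pixel looks like.

```python
import numpy as np

# Synthetic stand-in for the logo: 301 rows, 800 columns, 3 channels of 8-bit values.
fake_img = np.zeros((301, 800, 3), dtype=np.uint8)

# The shape unpacks as (height, width, channels).
height, width, channels = fake_img.shape
print("height:", height, "width:", width, "channels:", channels)

# Indexing a single pixel returns its three channel values.
fake_img[0, 0] = (255, 0, 0)
print("pixel at row 0, column 0:", fake_img[0, 0].tolist())
```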
plt.figure(figsize = (10, 10))
plt.imshow(logo_img);
The colors displayed above differ from those in the actual image. This is because matplotlib expects images in RGB format, whereas OpenCV stores images in BGR format. To render the colors correctly, we need to reverse the channel order of the image.
There are a couple of different approaches to reversing the order of the color channels. The first approach shown below uses a short-hand NumPy array slicing syntax that will reverse the order of the channels in the 3rd dimension of the image array.
# Swap the Red and Blue color channels.
logo_img = logo_img[:, :, ::-1]
# Display the image.
plt.figure(figsize = (10, 10))
plt.imshow(logo_img);
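To make the effect of the slicing concrete, here is a small sketch on a synthetic image (the array `bgr` is an illustrative assumption). A pure-blue pixel is stored as (255, 0, 0) in BGR order and becomes (0, 0, 255) once the last axis is reversed. The sketch also shows that the slice is a view on the same data rather than a copy.

```python
import numpy as np

# A 2x2 synthetic image in which every pixel is pure blue, stored in BGR order.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[:, :, 0] = 255  # channel 0 is Blue in BGR order

# Reverse the last axis to obtain RGB order.
rgb = bgr[:, :, ::-1]

print("BGR pixel:", bgr[0, 0].tolist())
print("RGB pixel:", rgb[0, 0].tolist())

# The reversed array is a view: it shares memory with the original.
print("shares memory:", np.shares_memory(bgr, rgb))

# Take a contiguous copy when an independent array is required.
rgb_copy = np.ascontiguousarray(rgb)
```

Some functions expect contiguous memory, so taking a copy with np.ascontiguousarray() can be a safer choice when the reversed array is passed on to other code.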
# Read the image.
logo = 'Pytorch_logo.png'
logo_img = cv2.imread(logo, cv2.IMREAD_COLOR)
# Print the size of the image.
print("Image size is ", logo_img.shape)
# Display the image.
plt.figure(figsize = (12, 12))
plt.imshow(logo_img);
Image size is (205, 1025, 3)
The color channels need to be swapped as in the previous example, but there is also an unexpected black background.
PNG images support a 4th channel called the "alpha" channel. The alpha channel contains transparency information that allows specific regions within an image to appear transparent. As an example, consider the Facebook logo in the previous section. The logo contains two colors (blue and white), and the white letters in the logo are actually white: (255, 255, 255). The PyTorch logo, on the other hand, contains an alpha channel that allows certain regions of the image to appear transparent. So the "white" background is not white. Instead, those pixels are masked by the 4th (alpha) channel and are interpreted as transparent. In this case, the pixels in the background portion of the image are set to (0, 0, 0), which will appear as black unless we include the alpha channel to mask them. We will cover transparency and alpha masking in more detail in a future module, but it is important to be aware of these details when reading and displaying images.
# Read the image.
logo = "Pytorch_logo.png"
logo_img = cv2.imread(logo, cv2.IMREAD_UNCHANGED)
# Print the size of the image.
print(logo_img.shape)
# Display the image.
plt.figure(figsize = (12, 12))
plt.imshow(logo_img);
(205, 1025, 4)
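To see why the transparent background rendered as black when only three channels were read, consider the simplified sketch below. It builds a tiny synthetic BGRA image (an illustrative assumption, not the PyTorch logo) and blends it over a white background using the standard alpha blending formula out = alpha * foreground + (1 - alpha) * background. This only previews the idea; it is not necessarily how any particular image viewer composites the image.

```python
import numpy as np

# Synthetic 1x2 BGRA image: one fully transparent black pixel, one fully opaque orange pixel.
bgra = np.array([[[0, 0, 0, 0], [0, 128, 255, 255]]], dtype=np.uint8)

# Dropping the alpha channel leaves the raw color values, so the transparent pixel is black.
color = bgra[:, :, :3].astype(np.float32)
print("without alpha:", color[0, 0].tolist())

# Scale alpha to [0, 1] and keep a trailing axis so it broadcasts over the color channels.
alpha = bgra[:, :, 3:4].astype(np.float32) / 255.0

# Blend over a white background.
white = np.full_like(color, 255.0)
blended = (alpha * color + (1.0 - alpha) * white).astype(np.uint8)
print("over white:", blended[0, 0].tolist(), blended[0, 1].tolist())
```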
cvtColor() converts an image from one color space to another. Note that the default color format in OpenCV is often referred to as RGB, but it is actually BGR (the channel order is reversed). So the first byte in a standard (24-bit) color image will be an 8-bit Blue component, the second byte will be Green, and the third byte will be Red. This function can be used simply to swap the order of the Blue and Red channels, but it can also convert between color spaces, as we will see further below.
dst = cv2.cvtColor(src, code)
dst: the output image, of the same size and depth as src.
The function has 2 required arguments:
src: input image: 8-bit unsigned, 16-bit unsigned ( CV_16UC... ), or single-precision floating-point.
code: color space conversion code (see ColorConversionCodes below).
OpenCV documentation: cvtColor(), ColorConversionCodes
We previously showed that swapping the Blue and Red channels can be accomplished using NumPy as follows: img[:, :, ::-1]. However, OpenCV provides a function for this purpose that also handles many other color conversions. Let's use cvtColor() to swap the channel order.
# Swap the Red and Blue color channels using: cv2.COLOR_BGRA2RGBA.
logo_img = cv2.cvtColor(logo_img, cv2.COLOR_BGRA2RGBA)
# Display the image.
plt.figure(figsize = (12, 12))
plt.imshow(logo_img);
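Note that the plain reversal img[:, :, ::-1] is only appropriate for 3-channel images. For a 4-channel BGRA array, reversing the last axis would also move the alpha channel to the front. The sketch below, on a synthetic pixel (an illustrative assumption), shows a NumPy index order that swaps Blue and Red while leaving alpha in place, which is what a BGRA-to-RGBA conversion is expected to do.

```python
import numpy as np

# One synthetic BGRA pixel: Blue=10, Green=20, Red=30, Alpha=255.
bgra = np.array([[[10, 20, 30, 255]]], dtype=np.uint8)

# Plain reversal puts alpha first (A, R, G, B), which is not RGBA.
reversed_all = bgra[:, :, ::-1]
print("reversed:", reversed_all[0, 0].tolist())

# Reorder only the color channels and keep alpha last.
rgba = bgra[:, :, [2, 1, 0, 3]]
print("RGBA:", rgba[0, 0].tolist())
```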
Let's now learn how to split and merge color channels using the split() and merge() functions in OpenCV.
split() Divides a multi-channel array into several single-channel arrays.
merge() Merges several arrays to make a single multi-channel array. All the input matrices must have the same size.
In the example below, we will read an image, split the color channels, and plot the individual channels as grayscale images to better understand how each channel contributes to the color in the original image.
# Split the image into the B,G,R components.
img_bgr = cv2.imread('Emerald_Lakes_New_Zealand.jpg', cv2.IMREAD_COLOR)
b, g, r = cv2.split(img_bgr)
# Show the channels.
plt.figure(figsize = [20, 10])
plt.subplot(141); plt.imshow(r); plt.title('Red Channel')
plt.subplot(142); plt.imshow(g); plt.title('Green Channel')
plt.subplot(143); plt.imshow(b); plt.title('Blue Channel')
# Merge the individual channels in R, G, B order so that matplotlib displays the colors correctly.
imgMerged = cv2.merge((r, g, b))
# Display the merged output.
plt.subplot(144)
plt.imshow(imgMerged)
plt.title('Merged Output');
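For intuition, the same split and merge can be sketched with plain NumPy on a small synthetic image (the array below is an illustrative assumption). Indexing the last axis extracts a channel, and np.dstack() stacks 2D arrays back into a multi-channel array.

```python
import numpy as np

# Synthetic 2x2 BGR image with distinct values per channel.
img_bgr_small = np.zeros((2, 2, 3), dtype=np.uint8)
img_bgr_small[:, :, 0] = 50    # Blue
img_bgr_small[:, :, 1] = 100   # Green
img_bgr_small[:, :, 2] = 150   # Red

# "Split": each channel is a 2D array.
b = img_bgr_small[:, :, 0]
g = img_bgr_small[:, :, 1]
r = img_bgr_small[:, :, 2]
print("channel shape:", b.shape)

# "Merge" in R, G, B order, as needed for display with matplotlib.
img_rgb_small = np.dstack((r, g, b))
print("merged shape:", img_rgb_small.shape)
print("first pixel (R, G, B):", img_rgb_small[0, 0].tolist())
```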
In simple terms, a color space is a specific organization of colors that typically represents the space of all possible human-perceivable colors. A color model is a mathematical construct for how to specify colors in the color space with a unique tuple of numbers (typically as three or four values representing the relative contributions of color components). A color model can be thought of as a mathematical way to navigate a color space. However, it is very common to use the term “color space” to collectively define both a color model along with a specific mapping of that model onto an absolute color space.
As an introduction to color spaces, we will consider two commonly used ones: the RGB color space (for Red, Green, Blue) and the HSV color space (for Hue, Saturation, Value). Both color spaces use a three-dimensional coordinate system to specify the component colors that represent a unique tuple, and therefore a unique color. These components are also referred to as color channels. Since color images are typically represented by three color channels as 8-bit unsigned integers for each channel, the individual color components can take on values in the range [0, 255]. We can therefore represent about 16.77 million unique colors (256 × 256 × 256). Note that for 8-bit images OpenCV stores hue in the range [0, 179], so the number of distinct 8-bit HSV tuples is somewhat smaller.
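The count quoted above follows directly from the number of values each 8-bit channel can hold. A quick check:

```python
# Each 8-bit channel can hold 2**8 = 256 distinct values.
values_per_channel = 2 ** 8

# Three independent channels give 256 * 256 * 256 combinations.
total_colors = values_per_channel ** 3
print(total_colors)  # 16777216
```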
In the examples below, we will be working with color images in RGB and HSV.
We can also use cvtColor() to convert from one color space to another. In the example below, we will convert the image data to the HSV color space, split the channels, and display the individual channels as grayscale images.
img_hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
# Split the image into the H,S,V components.
h, s, v = cv2.split(img_hsv)
# Display the channels.
plt.figure(figsize = [20, 5])
plt.subplot(141); plt.imshow(h); plt.title('H Channel')
plt.subplot(142); plt.imshow(s); plt.title('S Channel')
plt.subplot(143); plt.imshow(v); plt.title('V Channel')
# Display the original image.
plt.subplot(144); plt.imshow(img_bgr[:, :, ::-1]); plt.title('Original');
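To build intuition for what the H, S, and V values mean, the sketch below converts a few pure colors with Python's standard colorsys module and rescales them to the 8-bit convention OpenCV documents for HSV (hue halved to fit in [0, 179], saturation and value scaled to [0, 255]). The helper name rgb_to_hsv_8bit is hypothetical and the rounding is an assumption, so results from cvtColor() may differ slightly for arbitrary colors.

```python
import colorsys

def rgb_to_hsv_8bit(r, g, b):
    """Convert 8-bit RGB values to an OpenCV-style 8-bit HSV tuple (illustrative helper)."""
    # colorsys works with floats in [0, 1] and returns h, s, v in [0, 1].
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Hue: [0, 1] -> degrees [0, 360) -> halved; saturation and value -> [0, 255].
    return (int(round(h * 360 / 2)) % 180, int(round(s * 255)), int(round(v * 255)))

for name, rgb in [("red", (255, 0, 0)), ("green", (0, 255, 0)), ("blue", (0, 0, 255))]:
    print(name, rgb_to_hsv_8bit(*rgb))
```

Pure red, green, and blue differ only in hue, while saturation and value stay at their maximum, which is why the H channel carries the "which color" information.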
# Shift the hue channel by 10, then rebuild the image from the modified HSV channels.
h_new = h + 10
img_hsv_merged = cv2.merge((h_new, s, v))
img_rgb_merged = cv2.cvtColor(img_hsv_merged, cv2.COLOR_HSV2RGB)
# Display the channels.
plt.figure(figsize = [20,5])
plt.subplot(141); plt.imshow(h_new); plt.title('H Channel')
plt.subplot(142); plt.imshow(s); plt.title('S Channel')
plt.subplot(143); plt.imshow(v); plt.title('V Channel')
# Display the modified image.
plt.subplot(144); plt.imshow(img_rgb_merged); plt.title('Modified');
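Because 8-bit hue values in OpenCV lie in [0, 179], adding a constant can push some values past the top of that range. One way to keep shifted hues in range is to wrap them modulo 180, as sketched below on a small synthetic array (the values and the name shift_hue are illustrative assumptions, not part of the original notebook).

```python
import numpy as np

def shift_hue(h, offset):
    """Shift 8-bit hue values by offset, wrapping around at 180 (hypothetical helper)."""
    # Widen the type first so the addition cannot overflow uint8.
    shifted = (h.astype(np.int32) + offset) % 180
    return shifted.astype(np.uint8)

h = np.array([0, 90, 175], dtype=np.uint8)
print("plain shift:  ", (h + 10).tolist())
print("wrapped shift:", shift_hue(h, 10).tolist())
```

Wrapping matches the circular nature of hue, where values just below 180 sit next to values just above 0.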
Saving images with OpenCV is straightforward using the imwrite() function, which saves the image to the specified file. The image format is chosen based on the filename extension (see imread() for the list of extensions). In general, only 8-bit single-channel or 3-channel (with 'BGR' channel order) images can be saved using this function (see the OpenCV documentation below for further details).
cv2.imwrite(filename, img[, params])
The function has 2 required arguments:
filename: This can be an absolute or a relative path.
img: Image or images to be saved.
img = cv2.imread('Emerald_Lakes_New_Zealand.jpg', cv2.IMREAD_COLOR)
img_rgb = img[:, :, ::-1]
plt.figure(figsize = (5, 5))
plt.imshow(img_rgb);
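# Saving the RGB-ordered array produces a file with the Red and Blue channels swapped,
# because imwrite() expects BGR channel order.
cv2.imwrite('SAVED.jpg', img_rgb)
Image('SAVED.jpg', width = '300')
# Saving the original BGR array produces a file with the correct colors.
cv2.imwrite('SAVED.jpg', img)
Image('SAVED.jpg', width = '300')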
Please complete the code in the cell below.
# Read the saved image above ('Emerald_Lakes_New_Zealand.jpg') as a color image.
# YOUR CODE HERE
# Print the image shape.
# YOUR CODE HERE
# Convert the image to grayscale using cv2.cvtColor().
# YOUR CODE HERE
# Print the image shape.
# YOUR CODE HERE
# Display the image using matplotlib imshow()
# plt.figure(figsize = [10, 10])
# YOUR CODE HERE
Your results should look similar to this.
